Inductive Learning in Less Than One Sequential Data Scan

Authors

Wei Fan

Haixun Wang

Philip S. Yu

Shaw-hwa Lo

Abstract

Most recent research on scalable inductive learning over very large datasets, decision tree construction in particular, focuses on eliminating memory constraints and reducing the number of sequential data scans. However, state-of-the-art decision tree construction algorithms still require multiple scans over the dataset and use sophisticated control mechanisms and data structures. We first discuss a general inductive learning framework that scans the dataset exactly once. Then, we propose an extension based on Hoeffding’s inequality that scans the dataset less than once. Our frameworks are applicable to a wide range of inductive learners.
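The abstract does not spell out how Hoeffding's inequality permits stopping before a full scan, so the following is only a minimal sketch of the general idea, not the authors' framework. It uses the standard one-sided Hoeffding bound: for the mean of n independent values with range R, the error exceeds ε with probability at most δ when ε = R·sqrt(ln(1/δ) / (2n)). The function names, the running-mean statistic, and the stopping rule are all assumptions made for illustration, and the sketch presumes the examples arrive in random order.

```python
import math
from typing import Iterable, Tuple


def hoeffding_epsilon(value_range: float, n: int, delta: float) -> float:
    """One-sided Hoeffding error bound for the mean of n bounded samples."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return value_range * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def samples_needed(value_range: float, epsilon: float, delta: float) -> int:
    """Smallest n for which the Hoeffding bound is at most epsilon."""
    return math.ceil(value_range ** 2 * math.log(1.0 / delta) / (2.0 * epsilon ** 2))


def scan_until_confident(
    stream: Iterable[float],
    value_range: float,
    epsilon_target: float,
    delta: float,
) -> Tuple[float, int]:
    """Hypothetical early-stopping scan (illustrative, not from the paper).

    Reads values one at a time and stops as soon as the Hoeffding bound on
    the running mean drops to epsilon_target, so the rest of the stream is
    never read. Returns (running mean, number of values read).
    """
    total = 0.0
    n = 0
    for value in stream:
        total += value
        n += 1
        if hoeffding_epsilon(value_range, n, delta) <= epsilon_target:
            break
    if n == 0:
        raise ValueError("stream is empty")
    return total / n, n


if __name__ == "__main__":
    data = [1.0, 0.0] * 1000  # 2000 values in [0, 1]
    mean, seen = scan_until_confident(data, 1.0, 0.05, 0.05)
    print(f"estimated mean {mean:.3f} after reading {seen} of {len(data)} values")
```

In this toy setting the scan stops well before the end of the data because the required sample size depends only on R, ε, and δ, not on the dataset size, which is the property that makes a "less than one scan" approach plausible for very large datasets.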


Similar resources

Sequential Inductive Learning

In this paper I advocate a new model for inductive learning. Called sequential induction, this model bridges classical fixed-sample learning techniques (which are efficient but ad hoc), and worst-case approaches (which provide strong statistical guarantees but are too inefficient for practical use). According to the sequential inductive model, learning is a sequence of decisions which are infor...


Scaling Up Inductive Learning with Massive Parallelism

Machine learning programs need to scale up to very large data sets for several reasons, including increasing accuracy and discovering infrequent special cases. Current inductive learners perform well with hundreds or thousands of training examples, but in some cases, up to a million or more examples may be necessary to learn important special cases with confidence. These tasks are infeasible for...


The Task Rehearsal Method of Sequential Learning

An hypothesis of functional transfer of task knowledge is presented that requires the development of a measure of task relatedness and a method of sequential learning. The task rehearsal method (TRM) is introduced to address the issues of sequential learning, namely retention and transfer of knowledge. TRM is a knowledge based inductive learning system that uses functional domain knowledge as a...


A dynamic model of reasoning and memory.

Previous models of category-based induction have neglected how the process of induction unfolds over time. We conceive of induction as a dynamic process and provide the first fine-grained examination of the distribution of response times observed in inductive reasoning. We used these data to develop and empirically test the first major quantitative modeling scheme that simultaneously accounts f...


Inductive Logic Programming Used to Discover Topological Constraints in Protein Structures

This paper describes the application of the Inductive Logic Programming (ILP) program GOLEM to the discovery of constraints in the packing of beta-sheets in alpha/beta proteins. These constraints (rules) have a role in understanding the protein folding problem. Constraints were learnt for four features of beta-sheet packing: the winding direction of two sequential strands, whether two consecuti...



Publication date: 2003
